Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm for feature selection

Feature selection is an important problem in machine learning. Swarm intelligence algorithms play an essential role in feature selection due to their excellent optimization ability. The Chimp Optimization Algorithm (CHoA) is a new type of swarm intelligence algorithm that has quickly attracted widespread attention in the academic community due to its fast convergence speed and easy implementation. However, CHoA has difficulty balancing local and global search, which limits its optimization accuracy and leads to premature convergence, thus affecting the algorithm's performance on feature selection tasks. This study proposes the Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm (SOSCHoA). SOSCHoA enhances inter-population interaction through social coevolution, improving local search. Additionally, it introduces sine chaotic opposition learning to increase population diversity and prevent local optima. Extensive experiments on 12 high-dimensional classification datasets demonstrate that SOSCHoA outperforms existing algorithms in classification accuracy, convergence, and stability. Although SOSCHoA shows advantages in handling high-dimensional datasets, there is room for future research and optimization, particularly concerning feature dimensionality reduction.


Related work
High-dimensional optimization problems are widely found in engineering applications and scientific computing, for example, wind turbine fleet optimization 24 and automobile side impact optimization 25. However, swarm intelligence algorithms mainly suffer from poor solution quality and a tendency to fall into local optima on high-dimensional optimization problems. Therefore, many researchers have proposed improvement strategies to avoid falling into local optima, accelerate convergence, and find globally optimal solutions. Table 1 lists some swarm intelligence algorithms for solving high-dimensional optimization problems.
Neggaz 26 proposed ISSAFD, which relies on the sine cosine algorithm and perturbation operators to improve the performance of the salp swarm algorithm. Hussain 28 proposed the SCHHO algorithm, which enhances exploitation by fusing the sine cosine algorithm with Harris Hawks Optimization to dynamically adjust candidate solutions and avoid solution stagnation in HHO. Braik 27 proposed Chaotic sequence and Lévy flight with BCSA (CLBCSA). CLBCSA combines chaotic sequences and Lévy flights to enhance the algorithm's local exploitation capability while maintaining its global search capability; this combination strategy aims to improve the algorithm's ability to avoid local optima and to converge quickly to the global optimum. Yang 35 proposed the Binary Golden Eagle Optimizer combined with the Initialization of Feature Number Subspace (BGEO-IFNS). With the IFNS approach, BGEO-IFNS generates higher-quality initial populations, improving the algorithm's ability to search a high-dimensional search space and its final optimization performance. Nadimi-Shahraki 36 proposed the E-WOA algorithm, which solves the feature selection problem using a pooling mechanism and three effective search strategies; E-WOA was finally applied to COVID-19 disease feature selection. Rajalaxmi 30 proposed the BIGWO algorithm: first, the optimal solution is found by the GWO algorithm; then, the optimal feature subset is obtained by binary conversion of the optimal solution with V- and S-shaped functions. Gad 31 proposed the iBSSA algorithm, which first improves the local exploration capability using a local search algorithm, then improves the global search capability using a roaming-agent approach, and finally obtains the optimal feature subset by binary transformation of the optimal solution using V- and S-shaped functions. Wang 32 proposed the ABGWO algorithm, which introduces an adaptive coefficient to improve the local exploration and global search capabilities of GWO and obtains the optimal feature subset by binary conversion of the optimal solution with a Sigmoid transformation function. Long 33 proposed LIL-HHO: first, the escape energy E is reshaped by a sinusoidal function to achieve a good transition from the exploration phase to the exploitation phase; second, search accuracy is enhanced by introducing each eagle's individual best position; third, crystal imaging learning is used to escape local optima and reach the global optimum; finally, experiments show that the algorithm outperforms the comparison algorithms. Peng 44 proposed the EHHO algorithm, which first obtains the optimal solution by optimizing HHO through a hierarchical structure and then obtains the optimal feature subset by binary conversion of the optimal solution through a V-shaped transfer function. Chang 45 proposed the MSGWO algorithm: first, a Random Opposition-based Learning (ROL) strategy is applied to improve population quality in the initialization phase; second, the convergence factor is adjusted nonlinearly to reconcile global exploration and local exploitation; finally, a two-stage mixed-variance operator is introduced to increase population diversity and balance the exploration and exploitation capabilities of GWO. Houssein 46 proposed the mSTOA algorithm, which uses a balanced exploration/exploitation strategy, an adaptive control-parameter strategy, and a population reduction strategy to address poor convergence and improve classification accuracy. Duan 47 proposed the cHGWO-SCA algorithm: first, the SCA algorithm is used to update the position of the head wolf; second, the grey wolves are guided to search for prey using moderate-value weights and individual optimal positions to obtain the global optimal solution. Nadimi-Shahraki 43 proposed the MFO-SFR algorithm, which improves the search process through a stagnation finding and replacing (SFR) strategy, uses an archive to enrich population diversity, and is shown experimentally to be effective.
Wang 29 proposed the BChOA algorithm: first, the optimal solution is found by the CHoA algorithm; then, the optimal feature subset is obtained by binary transformation of the optimal solution with V- and S-shaped functions. Pashaei 37 proposed the BCHoA-C algorithm: first, the MRMR algorithm ranks the feature set and filters a subset of features with high relevance and low redundancy; second, the CHoA algorithm finds the optimal solution; finally, the optimal feature subset is obtained by binary conversion of the optimal solution using V-shaped and Sigmoid conversion functions. Khishe 49 proposed OBLChOA, which obtains the global optimal solution using a greedy search and an opposition-based learning strategy. Jia 39 proposed EChOA, which first initializes the population using polynomial mutation, then calculates the gap between the lowest-ranked chimp and the leader chimp via Spearman's rank correlation coefficient, and finally uses the beetle antennae operator to jump out of local optima and obtain the global optimal solution. Liu 40 proposed ULChOA, an algorithm that updates the location of prey using a generic learning mechanism that provides a dynamic balance between the exploration and exploitation phases; the algorithm was demonstrated to be effective through experiments. Kaur 34 proposed the SChoA algorithm, which addresses slow convergence by improving the chimp search and update equations with sine and cosine functions to obtain the optimal solution. Gong 38 proposed the NChOA algorithm, which uses niching techniques, the individual-best technique of PSO, and local search techniques to improve search efficiency and increase convergence speed. Wang 41 proposed AChOA, which initializes the population through a Tent chaotic mapping, then uses an adaptive non-linear convergence factor and adaptive weight coefficients to improve population diversity, and finally applies a Lévy flight strategy to jump out of local optima; the method is experimentally proven to be effective. Fahmy 42 proposed ECH3OA, which obtains the global optimal solution by fusing an enhanced Chimp Optimization Algorithm (CHoA) with the Harris Hawks Optimization (HHO) algorithm. Bo 48 proposed GSOBL-ChOA, a greedy-choice and opposition-based learning algorithm: first, the convergence rate is accelerated by applying the OBL technique in the exploration phase of CHoA; second, a greedy selection strategy is used to find the optimal solution. Although the swarm intelligence algorithms mentioned above improve search efficiency and increase convergence speed, they still suffer from unbalanced exploration and exploitation and poor solution quality, and they tend to fall into local optima. According to our study, enhancing local exploration, increasing population diversity, and finding globally optimal solutions have become essential for studying swarm intelligence algorithms in high-dimensional optimization [50][51][52]. Therefore, this paper focuses on the position-update equations and the global optimization mechanism of the CHoA algorithm. It proposes a Chimp optimization algorithm with a social coevolution strategy and Sine chaotic opposition learning and applies it to the high-dimensional classification feature selection problem.

Chimp Optimization Algorithm
The CHoA algorithm is a swarm intelligence optimization algorithm that simulates the hunting behaviour of chimps in nature. The chimp hunting process is generally divided into chasing and attacking the prey. The standard CHoA algorithm selects an attacker (first-best solution), a barrier (second-best solution), a chaser (third-best solution), and a driver (fourth-best solution) to jointly discover potential prey locations. In the search space, the chimp group mainly uses the four best-performing chimps to guide the other chimps toward the optimal region, while the attacker, barrier, chaser, and driver predict the possible locations of the prey during the iterative search, guiding the continuous search for the global optimal solution. The mathematical model of a chimp chasing prey during the search process is as follows:

d = |C · X_prey(t) − m · X_chimp(t)|    (1)
X_chimp(t + 1) = X_prey(t) − a · d    (2)

In Eqs. (1) and (2), X_prey is the position vector of the prey, X_chimp is the position vector of the current individual chimp, and t is the current iteration number. The coefficient vectors a, C, and m are calculated as follows:

a = 2 · f · r_1 − f    (3)
C = 2 · r_2    (4)
m = Chaotic_value    (5)

Here, r_1 and r_2 are random numbers in [0, 1]. f is the convergence factor, whose value decreases non-linearly from 2.5 to 0 as the number of iterations increases, and t_max denotes the maximum number of iterations. a is a random vector that determines the distance between the chimp and the prey, taking values in [−f, f]. m is the chaotic vector generated by a chaotic mapping. C is the control coefficient for chimp expulsion and prey chasing, and its value is a random number in [0, 2].
The mathematical model for the chimp attack on prey is as follows:

d_attacker = |C_1 · X_attacker − m_1 · X|,  X_1 = X_attacker − a_1 · d_attacker    (6)
d_barrier = |C_2 · X_barrier − m_2 · X|,  X_2 = X_barrier − a_2 · d_barrier    (7)
d_chaser = |C_3 · X_chaser − m_3 · X|,  X_3 = X_chaser − a_3 · d_chaser    (8)
d_driver = |C_4 · X_driver − m_4 · X|,  X_4 = X_driver − a_4 · d_driver    (9)
X(t + 1) = (X_1 + X_2 + X_3 + X_4) / 4    (10)
X_chimp(t + 1) = X(t + 1) if u < 0.5, or Chaotic_value if u ≥ 0.5    (11)

In Eqs. (6) to (11), X(t) is the position vector of the current chimp; X_attacker, X_barrier, X_chaser, and X_driver are the position vectors of the attacker, barrier, chaser, and driver, respectively; and X_chimp(t + 1) is the updated position vector of the current chimp. Chaotic_value is the chaotic mapping term used to update the position of the solution. From Eq. (10), individual chimp positions are estimated from the four best individual chimps, while the other chimps update their positions randomly. From Eq. (11), to simulate the social behaviour of chimps attacking their prey, let u be a random number in [0, 1]: when u < 0.5, Eq. (10) is used for the position update; when u ≥ 0.5, the position is updated by the chaotic mapping. This determines the chimp's attack behaviour randomly.
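For readers who prefer code, the following minimal Python sketch reproduces the position-update behaviour described above. It is an illustration rather than the authors' implementation; the function names and the uniform placeholder used for the chaotic vector are assumptions introduced here.

```python
import numpy as np

def chaotic_value(shape, rng):
    """Placeholder for the chaotic vector m / Chaotic_value.
    The original CHoA draws it from one of several chaotic maps; a uniform draw
    is used here purely to keep the sketch self-contained."""
    return rng.random(shape)

def choa_position_update(X, X_attacker, X_barrier, X_chaser, X_driver, f, rng):
    """One CHoA position update for a single chimp, following Eqs. (6)-(11) above.
    X and the four leader vectors are D-dimensional positions; f is the convergence factor."""
    candidates = []
    for leader in (X_attacker, X_barrier, X_chaser, X_driver):
        a = 2.0 * f * rng.random(X.shape) - f      # a in [-f, f], Eq. (3)
        C = 2.0 * rng.random(X.shape)              # C in [0, 2], Eq. (4)
        m = chaotic_value(X.shape, rng)            # chaotic vector, Eq. (5)
        d = np.abs(C * leader - m * X)             # distance to this leader
        candidates.append(leader - a * d)          # estimate driven by this leader
    if rng.random() < 0.5:                         # u < 0.5: average of the four estimates, Eq. (10)
        return np.mean(candidates, axis=0)
    return chaotic_value(X.shape, rng)             # u >= 0.5: chaotic-map update, Eq. (11)
```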

Proposed improved chimp optimization algorithm
The traditional CHoA has several limitations, such as falling into local optima, slow convergence, and an imbalance between exploration and exploitation. Therefore, our work develops a new variant of CHoA. The proposed algorithm does not alter the basic framework of the CHoA algorithm; it only introduces a social coevolution strategy into the CHoA position-update equation to overcome blind search and dynamically adjust the balance between local exploration and global exploitation. The Sine chaotic opposition learning mechanism improves the global search capability, enabling the algorithm to jump out of local optimal solutions. These components are described in detail below.

Social coevolution strategy
From Eq. (11), individual chimp positions are determined jointly by the attacker, barrier, chaser, and driver, or by the chaotic mapping used for position updating. This update equation has the following disadvantages:
• When the four key individuals in the population (attacker, barrier, chaser, and driver) are all caught in a local optimum, the entire population risks tilting towards a locally optimal solution, significantly constraining the algorithm's global search capability.
• If the attacker, barrier, chaser, and driver fall into the confines of a local optimal solution during the iterative process, the whole chimp population may quickly fall into the same trap, which severely limits the algorithm's convergence efficiency and slows its exploration towards the global optimum.
• Chaotic_value, as a randomly generated vector, carries a certain degree of randomness in its triggering mechanism, being activated with roughly half probability; this randomness also leads to a lack of stability. Although Chaotic_value allows individuals to escape from local optimal solutions to a certain extent, it does not fully consider the interactions and information exchanges within the population during the search. In particular, Chaotic_value fails to exploit the potential of learning positional information from other individuals in the population, which limits its efficacy in improving search efficiency and finding global solutions.
Therefore, to address the above defects, enhance the local exploitation capability of the chimp optimization algorithm, and improve communication among chimp individuals, this paper proposes updating the chimp individual positions with a social coevolution strategy, as given by the following equation.
In Eq. (12), r_3 is a random number in [0, 1], and C = (X_i + X_{i−1})/2 is the co-occurrence (mutual) vector, which represents the relationship between chimp i and chimp i − 1 in the chimp population. R is the benefit factor. This representation of the benefit factor R captures whether individual chimps benefit partially or fully from the interaction: when R = 1, chimp i and chimp i − 1 gain a small benefit from interacting with each other; when R = 2, chimp i and chimp i − 1 benefit greatly from interacting with each other.
The term r_3 · (X_attacker − C · R) is the social coevolution component, which not only allows the optimal chimp (X_attacker) to exchange information with ordinary chimps but also allows each chimp to exchange information with neighbouring chimps. This approach means the chimps no longer search only around a circle defined by the attacker, barrier, chaser, and driver. Furthermore, Eq. (12) leads individual chimps to converge steadily to the optimal value, which improves the algorithm's search accuracy and speed and yields the desired search results.
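Since the full form of Eq. (12) is not reproduced here, the following sketch only illustrates the described social coevolution component under stated assumptions: the co-occurrence vector C is taken as the mean of chimp i and its neighbour i − 1, the benefit factor R is drawn from {1, 2}, and the social term r_3 · (X_attacker − C · R) is simply added to the current position.

```python
import numpy as np

def social_coevolution_update(X, X_neighbour, X_attacker, rng):
    """Hedged sketch of the social coevolution component described for Eq. (12).
    Assumptions: C is the co-occurrence (mutual) vector of chimp i and chimp i-1,
    R (benefit factor) is drawn from {1, 2}, and the social term is added to X."""
    r3 = rng.random(X.shape)                 # random vector in [0, 1]
    C = (X + X_neighbour) / 2.0              # assumed co-occurrence vector of chimps i and i-1
    R = rng.integers(1, 3)                   # benefit factor: 1 (partial benefit) or 2 (full benefit)
    return X + r3 * (X_attacker - C * R)     # information exchange with the attacker and neighbour
```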

Sine chaotic mapping strategy
Chaos 53 is a stochastic, non-periodic, and non-convergent phenomenon found in non-linear dynamical systems. In mathematics, chaotic systems are a source of randomness. The main idea is to exploit the randomness and ergodicity of chaotic motion by mapping variables into the value interval of the chaotic variables and finally transforming the resulting solution linearly back into the space of the optimized variables. Common chaotic mappings in the optimization field are the logistic mapping 54, the Tent mapping 55, etc. The Sine chaotic mapping can help the algorithm jump out of the boundaries of local extreme points because it can search over a wide range. Using this advantage of the Sine chaotic mapping, the algorithm can explore the solution space more deeply and reduce the risk of falling into sub-optimal regions, improving the quality of the solution and the overall performance of the optimization process 56. The Sine mapping is calculated as in Eq. (13), where a ∈ (0, 1] is the control parameter and S(x_i^j) ∈ [−1, 1] is the chaotic sequence value.
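As an illustration, assuming the common Sine-map recurrence x_{k+1} = a · sin(π · x_k), which matches the stated ranges a ∈ (0, 1] and values in [−1, 1], a chaotic sequence can be generated as follows:

```python
import numpy as np

def sine_chaotic_sequence(x0, a=0.99, length=100):
    """Generate a Sine chaotic sequence, assuming the common form x_{k+1} = a * sin(pi * x_k)."""
    seq = [x0]
    for _ in range(length - 1):
        seq.append(a * np.sin(np.pi * seq[-1]))
    return np.array(seq)

# Example: a short chaotic sequence that could be used to perturb candidate positions.
print(sine_chaotic_sequence(0.7, a=0.99, length=5))
```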

Opposition-based Learning
Opposition-Based Learning (OBL) is a mathematical method proposed by Tizhoosh 57, whose essential principle is to select the better solution for the next iteration by estimating and comparing a feasible solution with its opposite solution. Rahnamayan 58 proposed an opposition learning strategy based on the neighbourhood centre of gravity, allowing the particle swarm to absorb the group search experience and increasing population diversity. Yin 59 showed that introducing opposition-based learning competition for local search in the basic particle swarm algorithm can improve the algorithm's performance on high-dimensional optimization. These studies use opposition-based learning to allow the opposite solution to reach the vicinity of the optimal solution more accurately and thereby improve intelligent optimization algorithms. The computational model of opposition-based learning is specified as

x̄_i^j = lb_j + ub_j − x_i^j    (14)

where X_i = (x_i^1, x_i^2, ..., x_i^D), i = 1, 2, ..., N, j = 1, 2, ..., D, N is the population size, and D is the dimension of the search space. X_i is a point in the D-dimensional space, X̄_i is the opposite of X_i, and lb_j and ub_j are the lower and upper bounds of dimension j.
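A minimal sketch of the opposition operator of Eq. (14), reflecting each coordinate about the centre of its bounds:

```python
import numpy as np

def opposite_solution(X, lb, ub):
    """Opposition-based learning (Eq. 14): reflect each coordinate about the centre of [lb, ub]."""
    return lb + ub - X

# Example: the opposite of a point in [0, 1]^3.
X = np.array([0.2, 0.9, 0.5])
print(opposite_solution(X, lb=0.0, ub=1.0))   # -> [0.8, 0.1, 0.5]
```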

Sine chaotic oppositional learning
From the CHoA algorithm description 14, when performing global exploration, the chimp algorithm first updates the dimensional information of a solution, then evaluates the objective function, and finally compares the fitness value of the current position with that of the previous position to determine whether the position is carried into the next iteration. However, as the dimensionality increases, the algorithm may face a decrease in population diversity in the later iterations, which increases the risk of falling into a local optimum. This loss of diversity directly affects the algorithm's convergence speed and the accuracy of the final solution. At the same time, as described in "Sine chaotic mapping strategy" and "Opposition-based Learning", Sine chaotic mappings are random and can search globally, and opposition learning can increase population diversity and speed up the algorithm's convergence.
Therefore, this paper proposes a strategy that combines Sine chaotic mapping and opposition learning. The goals are, first, to reduce mutual interference between dimensions and, second, to increase the diversity of the algorithm's search positions and help the algorithm expand its exploration area so that it gains the ability to escape local extrema. Its computational model is given in Eq. (15). Compared with general opposition learning, this paper uses Sine opposition learning to perturb the CHoA algorithm and enhance population diversity, increasing the likelihood that the algorithm jumps out of a local optimum and, to a certain extent, reducing the likelihood of it falling into a local optimum, thus improving the algorithm's optimization efficiency.
Although Eq. (15) generates an opposite solution, this opposite solution is not necessarily better than the original solution. Therefore, a greedy selection strategy is introduced to decide whether to replace the original solution with the opposite solution, i.e., the replacement is made only if the opposite solution has a better fitness value. This allows the better position to be carried into the next iteration, as modelled in Eq. (16). From Eqs. (15) and (16), when the algorithm falls into a local optimum, the Sine dimension-by-dimension opposition learning strategy can generate opposite solutions far from the local extremum. The greedy strategy then selects the individual with better fitness between the original and opposite solutions, producing chimp individuals with better positions. This effectively avoids the decline of population diversity in the late iterations and enhances the algorithm's global optimum-finding ability. At the same time, the dynamic boundary search employed by the Sine dimension-by-dimension opposition learning yields a progressively smaller search space, which allows the CHoA algorithm to evolve towards the target position according to the needs of each iteration and thus achieve a better convergence rate.
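Because Eqs. (15) and (16) are not reproduced in the extracted text, the sketch below only illustrates the general idea under explicit assumptions: each dimension's opposite value is computed against the current (dynamic) population bounds, scaled by a Sine chaotic value, and accepted only if it improves fitness (greedy selection).

```python
import numpy as np

def sine_opposition_with_greedy(X, population, fitness, a=0.99, rng=None):
    """Hedged sketch of dimension-wise Sine chaotic opposition learning with greedy selection.
    Assumption: the opposite of x_j is s_j * (lb_j + ub_j) - x_j, where s_j is a Sine chaotic
    value and lb_j/ub_j are the current per-dimension population bounds (dynamic boundaries)."""
    rng = rng or np.random.default_rng()
    lb = population.min(axis=0)                   # dynamic lower bound per dimension
    ub = population.max(axis=0)                   # dynamic upper bound per dimension
    s = a * np.sin(np.pi * rng.random(X.shape))   # Sine chaotic scaling per dimension
    X_opp = s * (lb + ub) - X                     # assumed Sine chaotic opposite solution (Eq. 15)
    # Greedy selection (Eq. 16): keep whichever point has the better (lower) fitness.
    return X_opp if fitness(X_opp) < fitness(X) else X
```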

SOSCHoA implementation step
Through the above description, this paper combines the social coevolution strategy, chaotic mapping theory, and the dimension-by-dimension opposition learning strategy to improve search efficiency and the algorithm's stability, so that better optimization results can be expected in each iteration. Combining the above improvements, the pseudo-code of the SOSCHoA algorithm is given in Algorithm 1 (SOSCHoA: the social coevolution and Sine chaotic opposition learning chimp optimization algorithm); a high-level illustrative sketch is also given after the list below. Compared with the basic CHoA algorithm, the SOSCHoA algorithm has the following features:
• The SOSCHoA algorithm does not change the framework of the basic CHoA algorithm; it only introduces new operators.
• The SOSCHoA algorithm updates the attack-prey position through a social coevolution strategy to enhance local exploration.
• The current optimal individual performs a dimension-by-dimension Sine chaos-based opposition learning strategy, which enhances population diversity and reduces the probability of the algorithm falling into a local optimum.
• A greedy mechanism allows the target position to lead the search towards the global optimal solution.
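Since Algorithm 1 itself is not reproduced here, the following high-level Python sketch (reusing the helper functions from the previous sketches) shows one plausible way the pieces fit together; the linear schedule for f and the exact placement of each operator are assumptions, not the authors' implementation.

```python
import numpy as np

def soschoa(fitness, lb, ub, dim, pop_size=30, max_iter=100, seed=0):
    """High-level sketch of SOSCHoA: CHoA update + social coevolution +
    Sine chaotic opposition learning on the current best individual, with greedy acceptance."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lb, ub, size=(pop_size, dim))
    fit = np.array([fitness(x) for x in pop])

    for t in range(max_iter):
        f = 2.5 * (1.0 - t / max_iter)                  # convergence factor, 2.5 -> 0 (linear here)
        leaders = pop[np.argsort(fit)[:4]]              # attacker, barrier, chaser, driver
        for i in range(pop_size):
            x = choa_position_update(pop[i], *leaders, f, rng)               # basic CHoA step
            x = social_coevolution_update(x, pop[i - 1], leaders[0], rng)    # Eq. (12) sketch
            x = np.clip(x, lb, ub)
            if fitness(x) < fit[i]:                     # greedy acceptance of the new position
                pop[i], fit[i] = x, fitness(x)
        # Sine chaotic opposition learning applied to the current best individual.
        b = int(np.argmin(fit))
        pop[b] = sine_opposition_with_greedy(pop[b], pop, fitness, rng=rng)
        fit[b] = fitness(pop[b])

    best = int(np.argmin(fit))
    return pop[best], fit[best]
```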

Proof of convergence of the SOSCHoA algorithm
Similar to the convergence analysis of most metaheuristic algorithms, we use a deterministic derivation to analyze the convergence of the SOSCHoA algorithm. It is important to note that the convergence proof does not guarantee that the algorithm converges to the global optimal solution. Since the CHoA algorithm is a population-based intelligent algorithm, the following theorem holds.
Theorem 1 If the CHoA algorithm based on general opposition learning converges, then the SOSCHoA algorithm also converges.
Proof Let X_i(t) and X̄_i(t) be the current and opposite solutions in generation t, and let x_i^j(t) and x̄_i^j(t) be the values of X_i(t) and X̄_i(t) in dimension j, respectively. Let x* be the optimal solution of the problem. By the conditions of Theorem 1, the solutions x_i^j(t) in generation t converge, i.e., x_i^j(t) → x*_j as t → ∞ (Eq. (17)). Since lb_j(t) = min_i x_i^j(t) and ub_j(t) = max_i x_i^j(t), it follows that lb_j(t) and ub_j(t) also converge to x*_j (Eq. (18)). At generation t, the current opposite solution generated by the Sine chaotic opposition learning strategy is given by Eq. (19). Letting t → ∞ in Eq. (19) yields Eq. (20). From Eq. (20), when x_i^j(t) converges to x*_j, the opposite solution based on the Sine chaotic opposition learning strategy also converges to x*_j. Therefore, if the CHoA algorithm based on the general opposite solution converges, the SOSCHoA algorithm also converges.
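A compact way to state the argument, assuming the opposite solution ultimately takes the standard dynamic-bound form x̄ = lb + ub − x (the Sine chaotic factor is omitted here for brevity):

```latex
% Sketch of the limit argument under the stated assumption.
\[
lb_j(t) = \min_i x_i^j(t), \qquad ub_j(t) = \max_i x_i^j(t).
\]
If the population converges, then $x_i^j(t) \to x_j^{*}$ for every $i$, and hence
\[
\lim_{t \to \infty} lb_j(t) = \lim_{t \to \infty} ub_j(t) = x_j^{*}.
\]
The opposite solution therefore satisfies
\[
\lim_{t \to \infty} \bar{x}_i^j(t)
  = \lim_{t \to \infty} \bigl( lb_j(t) + ub_j(t) - x_i^j(t) \bigr)
  = x_j^{*} + x_j^{*} - x_j^{*} = x_j^{*},
\]
so the opposition-based update inherits the convergence of the underlying CHoA iterates.
```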

Time complexity analysis of the SOSCHoA algorithm
The time complexity indirectly reflects the algorithm's convergence speed. In the CHoA algorithm, let α_1 be the time required to initialize the parameters (population size N, search-space dimension D, coefficients a, m, f, etc.), let α_2 be the time required to update the position of a chimp in each dimension according to Eq. (11), and let f(D) be the time required to evaluate the objective fitness function. The time complexity of CHoA is then O(α_1 + t_max · N · (D · α_2 + f(D))), i.e., O(t_max · N · (D + f(D))). In the SOSCHoA algorithm, the time required to initialize the parameters is the same as in the standard CHoA. In the loop phase, let α_3 be the time required to execute the social coevolution strategy, α_4 the time required to execute the dimension-by-dimension Sine chaotic opposition learning strategy, and α_5 the time required to execute the greedy mechanism. The time complexity of SOSCHoA is then O(α_1 + t_max · N · (D · (α_2 + α_3 + α_4) + α_5 + f(D))), which is of the same order, O(t_max · N · (D + f(D))), as the basic CHoA.
In summary, the improvement strategies proposed in this paper for CHoA do not increase the time complexity.

SOSCHoA based feature selection
The feature selection problem for high-dimensional datasets is a binary optimization problem 34; the solution space is limited to {0, 1}. For SOSCHoA, it is therefore first necessary to convert continuous optimization values to binary values. A feature selection solution can be represented as a search individual in the SOSCHoA algorithm; the individual's dimension equals the number of features in the original dataset, and each component x_i^j ∈ {0, 1}. The coding rule is: when x_i^j = 1, feature j of individual i is selected; when x_i^j = 0, feature j of individual i is not selected. For example, Table 2 represents a feature selection solution with an individual dimension of 9, corresponding to an original dataset with nine feature attributes. In it, x_i^1 = x_i^2 = x_i^4 = x_i^7 = x_i^8 = 1 indicates that individual i selected features 1, 2, 4, 7, and 8 in the optimal feature subset solution, and x_i^3 = x_i^5 = x_i^6 = x_i^9 = 0 indicates that features 3, 5, 6, and 9 were not selected. The classifier will then use features 1, 2, 4, 7, and 8 as classification data 60.
At the same time, SOSCHoA converts the continuous optimized form into binary form using a transfer function, as given in Eq. (24), where x_i^j is the value of position j in individual i. The feature selection problem for a dataset is a multi-objective optimization problem that requires the highest possible classification accuracy while minimizing the number of selected features. To balance the number of selected features (minimization) and the classification accuracy (maximization), the fitness function is defined as

Fitness = α · γ_R(D) + β · |Selected| / |ALL|    (25)

where γ_R(D) denotes the classification error rate (in this paper, the K-Nearest Neighbor (KNN, k = 5) algorithm is used to evaluate the classification accuracy of the selected feature subset), |Selected| denotes the number of selected features, and |ALL| denotes the number of original features. α is the weighting factor, α ∈ [0, 1], and β = 1 − α. Since the classification error term plays the dominant role when SOSCHoA searches for the optimal feature subset, α is set to 0.99.
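To make Eq. (25) concrete, the following sketch evaluates a binary feature mask with a KNN (k = 5) classifier, using α = 0.99 and β = 1 − α as stated above. The 5-fold cross-validation protocol and the sigmoid transfer function in `binarise` are illustrative assumptions; the paper's exact validation setup and Eq. (24) are not reproduced here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def feature_selection_fitness(mask, X, y, alpha=0.99):
    """Fitness of a binary feature mask (Eq. 25): alpha * error_rate + (1 - alpha) * |Selected| / |ALL|."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                                        # no feature selected: worst possible fitness
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=5)
    acc = cross_val_score(knn, X[:, mask], y, cv=5).mean()    # 5-fold CV is an assumed protocol
    error_rate = 1.0 - acc                                    # gamma_R(D)
    beta = 1.0 - alpha
    return alpha * error_rate + beta * mask.sum() / mask.size

def binarise(position, rng):
    """One common way to map a continuous position to a binary mask (illustrative, not Eq. 24 verbatim)."""
    prob = 1.0 / (1.0 + np.exp(-position))                    # S-shaped transfer function
    return (rng.random(position.shape) < prob).astype(int)
```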

Experimental validation and analysis
To verify the dimensionality-reduction and classification performance improvements of SOSCHoA on high-dimensional classification data, this section conducts a series of comparison experiments. The high-dimensional classification datasets used are described in Table 3, and the settings of the comparison algorithms are presented in Table 4. First, the experimental setup is introduced; second, the classification performance and the number of features selected by SOSCHoA are analyzed; third, the classification performance, number of features, and running time of SOSCHoA are analyzed and evaluated against other heuristic algorithms; finally, the convergence performance of the compared algorithms is verified and the Wilcoxon rank-sum test is applied.

Description of the experimental dataset
The experimental datasets were selected from the internationally well-known ASU high-dimensional dataset repository (https://jundongl.github.io/scikit-feature/datasets.html). Table 3 briefly describes these datasets, with the number of samples ranging from 62 to 210, the number of features ranging from 325 to 22,283, and the number of class labels ranging from 2 to 11. When the number of class labels is two, the task is considered binary classification; when the number of class labels is greater than two, it is considered multi-class classification.

Experimental settings
To evaluate the impact of the proposed strategy mechanism on the classification performance of high-dimensional microarray data during feature selection, three sets of comparison experiments were designed as follows.
In the first set of comparison experiments, the classification performance of SOSCHoA was compared with that of the CHoA algorithm 14 and the DLFCHOA algorithm 17. In the second set, SOSCHoA was compared with PIL-BOA 18, BBOA 19, LMRAOA 20, and VGHHO 21, metaheuristics that employ different opposition-based learning strategies, in terms of fitness values and classification performance. In the third set, SOSCHoA was compared with FA 22, FPA 10, WOA 12, HHO 13, and MRFO 23 in terms of fitness values and classification performance. The experimental framework is shown in Fig. 1.
Figure 1 shows that SOSCHoA is first run on the training dataset to generate a candidate feature subset. Secondly, the training and test sets are transformed into new training and test sets by removing the unselected features. Finally, the test dataset is fed into the classifier to verify the classification performance of the selected feature subset against the feature subsets selected by the comparison algorithms. This process improves the efficiency and effectiveness of group collaboration; in addition, the SOSCHoA algorithm can help the group jump out of local optimal solutions and search further for the global optimal solution.

Comparison of SOSCHoA with CHoA and DLFCHOA classification performance
In Table 5, AccMean (%), maxAcc (%), and SD denote the average classification accuracy, the best classification accuracy, and the standard deviation of each algorithm over 30 independent runs on each classification dataset. In Table 6, d and time (/s) denote the average number of selected features and the average running time of each algorithm over 30 independent runs.
As seen from Table 5, SOSCHoA achieves higher average classification accuracy than the CHoA algorithm on all test datasets. Compared with the DLFCHOA algorithm, SOSCHoA achieves higher average classification accuracy on all test datasets except the nci9 dataset. Regarding the standard deviation, SOSCHoA is better than CHoA on all test datasets except warpPIE10P, and better than DLFCHOA on all test datasets except Carcinom. In conclusion, SOSCHoA shows better performance than the DLFCHOA and CHoA algorithms in terms of both average classification accuracy and robustness.
As can be seen from Table 6, SOSCHoA has the highest average number of selected features among the three algorithms at 66.66, which is 12.31 and 21.59 higher than DLFCHOA and CHoA, respectively. This indicates that SOSCHoA still needs to improve its feature selection capability and further reduce the number of selected features.

Analysis of CHoA algorithm improvement strategies
The datasets in Table 3 were used for classification-accuracy and fitness-value experiments to analyse the impact of the improvement strategies on the algorithm's performance. The CHoA algorithm that employs only the social coevolution strategy (SOCHoA) is compared with the CHoA algorithm that escapes local optimal solutions using only the Sine chaotic opposition learning strategy (SCHoA). The parameters of these two algorithms are the same as in "Experimental settings".
The comparison results in Table 7 show that the variant using only the social coevolution strategy achieves better classification accuracy and average fitness values than the SOSCHoA algorithm on the warpPIE10P, Carcinom, and nci9 datasets, while the variant using only the Sine chaotic opposition learning strategy is better than the SOSCHoA algorithm on the lung and Lung_Cancer datasets. Meanwhile, combining the results in Tables 5 and 7, both SOCHoA and SCHoA perform poorly in classification accuracy and average fitness value on the lung_discrete dataset, which suggests that adopting only the social coevolution strategy or only the Sine chaotic opposition learning strategy is not sufficient to significantly improve the performance of the CHoA algorithm.
In conclusion, the overall results of SOSCHoA are better than those of the two sub-algorithms SOCHoA and SCHoA. The comparison shows that both improvement strategies contribute to improving the algorithm and that their benefits can be combined effectively without either operator suppressing the other, which confirms the effectiveness of the improvement strategies. Therefore, the SOSCHoA algorithm improves the CHoA algorithm, strengthens its global exploration and local exploitation abilities, accelerates convergence, escapes local optima, and achieves higher classification accuracy and smaller optimal fitness values.

Analysis of the impact of opposing learning strategies on classification performance
To verify the superiority of SOSCHoA, algorithms with different opposition-based learning strategies, namely PIL-BOA, BBOA, LMRAOA, and VGHHO, were selected to compare and validate the classification performance on the test data. The algorithms were compared on the 12 test datasets given in Table 3. Each algorithm was run 30 times to obtain the average classification values, and the comparison results are shown in Table 8.
From the results in Table 8, on the lung dataset the classification performance of SOSCHoA was better only than that of VGHHO, and on the Carcinom dataset the classification performance of SOSCHoA was the worst. For all other datasets, the classification performance of SOSCHoA was better than that of the other metaheuristics, indicating that SOSCHoA has a clear advantage in classification performance. Also, the running time of the SOSCHoA algorithm is well within the acceptable range. To further demonstrate the effectiveness of the SOSCHoA algorithm, it was compared with five other heuristic optimization algorithms. Table 9 shows the average classification accuracy, Table 10 the number of selected features, and Table 11 the average running time of these algorithms. Table 9 shows that on the warpPIE10P dataset, WOA achieved the best classification accuracy and SOSCHoA ranked third. On the lung and Lung_Cancer datasets, FA achieved the best classification accuracy and SOSCHoA ranked second. On the Carcinom and nci9 datasets, HHO achieved the best classification accuracy and SOSCHoA ranked second. For all other datasets, SOSCHoA's classification performance was better than that of the other metaheuristics, indicating that SOSCHoA has a clear advantage in classification performance.
As seen from Table 10, the number of features selected by SOSCHoA is lower on all test datasets than for the five algorithms FA, FPA, WOA, MRFO, and HHO. From Tables 9 and 10, it can be seen that the SOSCHoA algorithm is the most efficient.
As seen from Table 11, the running time of the SOSCHoA algorithm is still relatively long due to the larger search space in high-dimensional data. However, it remains well within the acceptable range.
In summary, the SOSCHoA algorithm has a robust search capability and can find a relatively small, high-quality subset of features. Secondly, the SOSCHoA algorithm improves the classification accuracy on the selected feature subset. Finally, the feature subset chosen by the SOSCHoA algorithm still has room for further reduction and for improvement in classification accuracy, which motivates subsequent research and the design of new mechanisms to further reduce the size of the feature subset and improve the model's classification performance. Regarding the statistical comparison, on the Leukemia_1 dataset SOSCHoA was found to be statistically equivalent to the WOA and MRFO algorithms overall. These results show that the SOSCHoA algorithm usually provides statistically significant performance improvements. However, we also note that on specific datasets, the performance of SOSCHoA is similar to that of the other algorithms. This may be due to the characteristics of these datasets or to the inherent advantages of particular algorithms in dealing with specific problems.

Conclusion
When dealing with high-dimensional classification data, the complex interactions between features pose significant challenges to feature selection algorithms. The traditional CHoA has limitations in fast convergence and accurate optimization, and it struggles to identify and eliminate irrelevant and redundant features efficiently. To overcome these limitations and improve the algorithm's global search capability and convergence efficiency, after an in-depth study of the core mechanism of CHoA, this paper proposes a new algorithm: the Social coevolution and Sine chaotic Opposition learning Chimp Optimization Algorithm (SOSCHoA). The improvements of the SOSCHoA algorithm are mainly reflected in the following aspects:
• A social coevolution strategy is introduced, which enhances information exchange between individuals, extends the search subspace, and dynamically adjusts the balance between local exploration and global exploitation.
• Sine chaotic opposition learning is used to increase population diversity and to improve the algorithm's ability to jump out of local optima and approach the global optimal solution.
• Experimental results show that SOSCHoA significantly outperforms existing algorithms such as CHoA, DLFCHOA, PIL-BOA, BBOA, VGHHO, FA, FPA, WOA, HHO, and MRFO in terms of convergence rate, classification accuracy, and feature reduction ability. These results confirm the advantages of SOSCHoA in improving classification accuracy and reducing the number of features. However, regarding the reduction of feature dimensionality, the SOSCHoA algorithm still falls short on datasets such as warpPIE10P, lung, Carcinom, and nci9.
Future research will focus on further optimizing the position update equation and the global exploration mechanism to improve the high-dimensional classification optimization capability of SOSCHoA, especially when dealing with datasets with higher feature dimensions.

Figure 2. Variation of SOSCHoA classification accuracy versus the number of selected features.

Figure 4. Comparison of the convergence curves of the SOSCHoA algorithm with the other eleven compared algorithms.

Figure 5. Comparison of the convergence curves of the SOSCHoA algorithm with the other eleven compared algorithms.

Table 1. Research on meta-heuristic algorithms for high-dimensional data.

Table 3. Name of dataset, number of samples, number of features, and number of classification labels.
Figure 1. Experimental framework.

Table 4. Comparison algorithm parameter settings.

Table 5. Comparison of the classification performance of SOSCHoA with CHoA and DLFCHOA.

Table 6. Comparison of the number of selected features and running time (/s) for SOSCHoA with CHoA and DLFCHOA.

Table 7. Comparison of classification accuracy and average fitness value test results of the algorithms.

Table 8. Analysis of the running time (/s) and classification accuracy of SOSCHoA and algorithms with different opposition-based learning strategies.

Table 9. Average classification accuracy performance of SOSCHoA and the other four heuristic optimization algorithms.

Table 10. Average number of features selected by SOSCHoA and other heuristic optimization algorithms.